%%HTML
<style>
.container{width:90% !important;}
</style>
import pickle
from IPython.display import HTML
from bokeh.plotting import output_notebook
output_notebook()
To study the social interactions among users in the same company, we have built a directed network representation of our dataset. Each employee is a node in the network. Given two employees A and B, if A has liked or disliked a comment made by B, an edge from A to B is added to the network. The adjacency matrix of the resulting graph is used to perform a two-component NMF clustering with two different edge weights: the total number of interactions, and the number of likes divided by the total number of interactions. The clustering is performed on both the directed and undirected versions of the adjacency matrices, giving rise to eight graph-based features (two components × two weightings × two graph versions).
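The construction described above can be sketched as follows. This is only an illustration under stated assumptions: the `interactions` log is made-up toy data, the undirected graph is obtained by summing counts in both directions, and the NMF is a plain multiplicative-update implementation in NumPy standing in for whichever NMF routine was actually used.

```python
import numpy as np

# Hypothetical interaction log: (voter, comment_author, liked).
interactions = [
    ("A", "B", True), ("A", "B", True), ("A", "B", False),
    ("B", "C", True), ("C", "A", False), ("D", "C", True),
]

def build_counts(interactions):
    """Return employee ids and directed matrices of total votes and likes."""
    employees = sorted({e for src, dst, _ in interactions for e in (src, dst)})
    index = {e: i for i, e in enumerate(employees)}
    n = len(employees)
    total = np.zeros((n, n))
    likes = np.zeros((n, n))
    for src, dst, liked in interactions:
        total[index[src], index[dst]] += 1      # edge from voter to author
        likes[index[src], index[dst]] += liked  # True counts as 1
    return employees, total, likes

def like_ratio(likes, total):
    """Likes divided by total interactions; 0 where no edge exists."""
    return np.divide(likes, total, out=np.zeros_like(total), where=total > 0)

def nmf(V, n_components=2, n_iter=500, seed=0, eps=1e-9):
    """Factorise V ~ W @ H with non-negative W, H (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_components))
    H = rng.random((n_components, V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

employees, total, likes = build_counts(interactions)

# Assumption: the undirected graph merges the counts of both directions.
undirected_total = total + total.T
undirected_likes = likes + likes.T

matrices = [
    total,                                         # directed, interaction count
    like_ratio(likes, total),                      # directed, like ratio
    undirected_total,                              # undirected, interaction count
    like_ratio(undirected_likes, undirected_total) # undirected, like ratio
]

# Two NMF components per matrix give eight features per employee.
features = np.hstack([nmf(m)[0] for m in matrices])
```

Each row of `features` corresponds to one employee, in the order given by `employees`, and each pair of columns holds the two component loadings for one of the four matrices.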
In the following representation of the dataset, the color of each node depends on the target variable churn: a node is red if it corresponds to an employee who churned, and blue if the employee will not churn in the following twelve weeks.
Each edge is coloured according to the relative agreement of the interaction, ranging from red (rel_agreement=0) to green (rel_agreement=1). Each node has an alpha value proportional to its number of interactions, ranging from 0.5 to 1.
It is also possible to zoom in and hover over the nodes and edges to display a tooltip.
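As a rough illustration of the styling rules described above, the mapping from values to colours and transparency could look like the sketch below. The function names `edge_color` and `node_alpha` are hypothetical, and the linear interpolation and the rescaling against a maximum interaction count are assumptions; the plot itself may use a different palette or scaling.

```python
def edge_color(rel_agreement):
    """Interpolate linearly from red (0) to green (1); returns a hex colour."""
    t = min(max(rel_agreement, 0.0), 1.0)  # clamp to [0, 1]
    red = round(255 * (1 - t))
    green = round(255 * t)
    return f"#{red:02x}{green:02x}00"

def node_alpha(n_interactions, max_interactions, lo=0.5, hi=1.0):
    """Rescale an interaction count linearly into the alpha range [lo, hi]."""
    if max_interactions <= 0:
        return lo
    share = min(n_interactions / max_interactions, 1.0)
    return lo + (hi - lo) * share

fully_disagreeing = edge_color(0.0)  # pure red
fully_agreeing = edge_color(1.0)     # pure green
half_active = node_alpha(5, 10)      # midway between 0.5 and 1
```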
# Load the pickled plot markup and render it inline in the notebook.
with open('bokeh/interactions_churn.pck', 'rb') as f:
    dataset = pickle.load(f)
HTML(dataset)